This study reports on the capabilities of 53 Year 4 students as they completed the final stages 
of their first complete statistical investigation. In the context of becoming acquainted with 
students in a sister school in another city, students in both schools wrote and refined 
questions, which were answered by all students in an on-line survey. Using data from both 
schools, students chose at least one question to analyse and present their findings comparing 
the two cities and/or groups of students. Of interest are the representations created for the 
questions chosen, the conclusions drawn, the variation observed, the certainty about the 
conclusion, and the comments about what had been learned about writing survey questions. 


In today’s world of Big Data, statistics educators face the task of preparing statistically 
literate citizens, as well as inspiring some to become professional statisticians. For some 
time (e.g., Wallman, 1993) there have been calls to recognise the need for a statistically 
literate population to make meaningful decisions, both personally, and in society. Gal (2002) 
extended the requirement to include the ability to communicate concerns about suspicious 
claims. At the school level, Watson (2006) suggested that for judging a particular claim, 
students need to know the terminology employed, understand how it is used in the context 
of the claim, and have the critical thinking skills to judge the merit of claims. 

A question for educators in schools is, “how do we help students develop the skills and 
understanding needed to become statistically literate?” The answer from the American 
Statistical Association (ASA) is found in the Guidelines for Assessment and Instruction in 
Statistics Education (GAISE) Report (Franklin et al., 2007). GAISE outlines the steps in 
“statistical problem-solving” in such a way that students learn the Practice of Statistics 
(Moore & McCabe, 1989). As Moore and McCabe claim, “Statistics in practice is concerned 
with gaining understanding from data; it is focused on problem-solving rather than on 
methods...” (p. xi). GAISE also recognises that variation is the underlying phenomenon at 
every step of the practice: (i) Formulate Questions, anticipating variability, (ii) Collect Data, 
acknowledging variability, (iii) Analyse Data, accounting for variability, and (iv) Interpret 
results, allowing for variability. The aim is for students to use data to make informal 
inferences for populations, analysing the evidence they have from samples, acknowledging 
uncertainty (Makar & Rubin, 2009). 

As well as the general need for a statistically literate population, the growing recognition 
of the importance of STEM (Science, Technology, Engineering and Mathematics) fields for 
solving a nation’s economic and environmental problems (e.g., Office of the Chief Scientist, 
2013), combined with the emergence of the field of Data Science itself (Finzer, 2013), 
increases pressure to employ more professional statisticians. The ASA (2017) claims that 
the expected employment growth for statisticians in the United States between 2014 and 
2024 is 34%, compared with 7% for the average of all occupations. In Australia and New 
Zealand, it appears “the shortage of statisticians is a worsening problem” (Cameron, Iosua, 
Parry, Richards, & Jaye, 2017, p. 367). These imperatives motivate activities that help 
students develop the skills to be able to engage in all of the steps of the practice of statistics, 
thereby engaging in the work of statisticians. 


2019. In G. Hine, S. Blackley, & A. Cooke (Eds.). Mathematics Education Research: 
Impacting Practice (Proceedings of the 42nd annual conference of the Mathematics 
Education Research Group of Australasia) pp. 739-746. Perth: MERGA. 


Background 


Research that reports student outcomes and capabilities in relation to working through 
all of the steps of the practice of statistics is scant. Of particular relevance is a study 
conducted by Paparistodemou and Meletiou-Mavrotheris (2008), who worked with Year 3 
students in Cyprus to introduce informal inference with data generated from a survey where 
they asked simple questions of classmates. The young students in that study demonstrated 
their capacity to draw conclusions based on the data and the situation within which the data 
were collected, relate the conclusions to a larger population, and identify the uncertainty 
associated with the conclusions. Later the researchers tested a hypothetical learning 
trajectory based on these results with Year 6 students to illustrate the feasibility of primary 
children being able to undertake the practice of statistics. This time students asked questions 
about eating habits. The authors documented six ways in which most students improved their 
reasoning about samples and sampling in the context of drawing conclusions, including the 
importance of sample size, the need to avoid sample bias, and the opportunity to increase 
representativeness with stratification (Meletiou-Mavrotheris & Paparistodemou, 2015). The 
contexts chosen by the researchers and subsequently the questions chosen by the students 
were based in the social sciences. 

Watson and English (2015, 2018), however, chose STEM-related contexts when they 
worked with students in Years 5 and 6 to demonstrate the students’ capacity for carrying out 
the practice of statistics with authentic data. First, in Year 5, students considered the 
question, “Are we environmentally friendly?”, with each student deciding the percentage of 
“yes” responses to five sub-questions about their habits with respect to sustainability that 
would be required to answer the main question in the affirmative. The students collected and 
analysed data first for their class and then for Australia using samples from an Australian 
Bureau of Statistics (ABS) “population” of 1300 Year 5 students. In Year 6, students were 
shown a claim in the media that people with brown eyes had faster reaction times than people 
with eyes of other colours. The students’ task was to explore this claim with data from their 
class and then again with random samples from an ABS “population” of 1786 Year 6 
students, using the same on-line reaction timer. Through these activities and others, students 
demonstrated increased ability to carry out the practice of statistics, as well as increased 
critical thinking in other statistical contexts, shown in longitudinal statistical literacy surveys 
carried out across the larger study (Watson, Callingham, & English, 2017). In these two 
activities, the questions posed, the first step in the practice of statistics, were determined by 
external sources, not by the students themselves. 

Although it is important for students to investigate questions raised in wider social 
contexts, the purpose of the activity reported here was to return to a relatively general context 
in which students could pose a wide range of questions of interest to themselves to 
investigate. After completing the practice of statistics, by analysing and interpreting data 
generated from questions they themselves posed, they could also be asked to reflect on their 
learning about posing questions. 


Research Approach 


The activity reported here was the third, in term 3 of Year 4, as part of a longitudinal 
project that followed the progress of students from mid-way through Year 3 to the end of 
Year 6. For the students in the project, the aim was to build and reinforce the development 
of understanding and capability related to the practice of statistics through investigations 
embedded within STEM contexts. A design-based research approach (Cobb, Confrey, 
diSessa, Lehrer, & Schauble, 2003) was adopted that used results from earlier activities to 
inform the implementation of teaching interventions as the project progressed. 

In Year 3, the essential importance of variation in all contexts and for applying statistics 
was emphasised when students created “licorice” sticks two ways: by hand and by 
“machine” (Watson, Fitzallen, English, & Wright, 2019). Students then encountered the 
context of heat transfer, learning new ways to represent and analyse data (Fitzallen, Watson, 
& Wright, 2017; Fitzallen, Wright, Watson, & Duncan, 2016). In the third activity, described 
here, the students had active involvement in all stages of the practice of statistics. It allowed 
them to pose and refine their own questions to collect data, to compare results and to draw 
conclusions about students’ activities in the context of the two cities. The initial part of the 
activity, posing and refining the student questions, to be administered via an on-line survey, 
is described in English, Watson, and Fitzallen (2017). This paper reports on the way in which 
the students in one school analysed and interpreted the data generated from the survey and 
subsequently reflected on the process. 


Participants 


Year 4 students from two parochial suburban schools in two Australian cities completed 
the activity. All 55 students present in two classes at one of the schools (City A) took part in 
the activity. Data are reported on the 53 students from that school whose parents consented 
to the collection of data for them. The data are deidentified and reported using unique codes 
for each student. The average age of participants was 10.0 years (range 9.1 years to 10.7 
years) and the gender split was 60% male and 40% female. The project had ethics approval 
from the Tasmania Social Sciences Human Research Ethics Committee (H0015039). 


Implementation 


The activity involved the students from both schools posing and refining questions, 
which were used to develop a survey to collect data about the students’ activities and their 
environment in both cities (English et al., 2017). After introducing themselves to the students 
in the other city through short biographies, students individually posed the questions they 
wanted answered. The questions were then refined by students to eliminate non-statistical 
questions and each class decided which questions they wanted to include in the survey. After 
a discussion of the part played by technology in collecting and compiling the data, the 
students from both schools completed the survey on-line. The survey contained 22 questions: 
7 numerical (e.g., “How many hours do you spend doing sport each week in summer?”); 4 
multiple choice (e.g., “When is the best time of year to visit your city? a) Summer, b) 
Autumn, c) Winter, d) Spring”); 6 text box (e.g., “What do you love to do the most with your 
family on a sunny day?”); 1 yes/no (“Do you get homework?” Yes/No); 3 sliding scale 
(e.g., “On a scale of 1 to 10, how much do you like visiting your botanical gardens?”); and 
1 ranked (“Rank these Australian birds and animals by how much you like them ... Koala, 
Kangaroo, Wombat, Cassowary, Tasmanian Devil, Kookaburra”). This process provided 
students with approximately 85 data values for each question to analyse to compare the two 
cities and the students’ activities. 

The data from the survey questions were distributed to students working in groups of 
three and students negotiated the choice of which questions to analyse. Although individual 
students may have chosen different questions to analyse, discussion was encouraged among 
members of the group. Students were first asked to produce a representation of their data 
and to answer questions about the process undertaken by writing in their workbooks. Finally, 
the students presented their results to the class and there was general discussion about what 
they had learned about differences and similarities between the two cities and groups of 
students. 

Data Analyses 


For the purposes of this paper, the data collection instruments were the student 
representations and their responses to four questions in their workbooks (Table 1). These 
tasks were intended to monitor understanding of the three major steps in the practice of 
statistics (Franklin et al., 2007) related to the activity. Posing and refining questions, which 
was reported in detail in English et al. (2017), was reviewed in Q4 in terms of what students 
had learned on this aspect. The representation drawn and Q2 reflected the Analysis 
undertaken, including consideration of variation, with Q1 and Q3 covering the Interpretation 
of results and the confidence in the decision made. To analyse the data, rubrics were devised 
by the authors for the representations and the four questions, as shown in Table 1. Generally, 
the descriptors for each level, as they incremented, reflected the increasing amount of 
evidence drawn from the data in the context of the question. Not all students answered every 
question in the workbook, hence the sample size varies for the questions. Coding was carried 
out by the third author and an experienced research assistant, with discrepancies in coding 
resolved by discussion. The level of initial agreement on the coding was 74%. Scores for the 
representations and Q1, Q2, and Q3 in the workbook were summed for students who 
completed all three questions, to give an indication of the range of capacity to engage in the 
Analysis and Interpretation steps of the practice of statistics. The two codes for Q4 were 
combined for students who gave at least one response, monitoring reflection on writing and 
refining questions in the light of the purpose for the data collection. 
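
The two calculations described in this paragraph can be sketched in a few lines of Python. This is an illustrative sketch only: the paper does not state how agreement was computed, so simple percent agreement (identical codes divided by items coded) is an assumption, and the function names, dictionary keys, and example codes are hypothetical rather than the project's data.

```python
from typing import Optional, Sequence

# Tasks whose codes are summed into a total score (keys are hypothetical labels).
TASKS = ("representation", "Q1", "Q2", "Q3")


def percent_agreement(coder_a: Sequence[int], coder_b: Sequence[int]) -> float:
    """Percentage of items to which two coders assigned the same code.

    Assumes simple percent agreement; the paper does not name the statistic used.
    """
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("both coders must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


def total_score(codes: dict) -> Optional[int]:
    """Sum the codes for the representation, Q1, Q2, and Q3.

    Returns None when any task is missing, mirroring the decision to total
    scores only for students who completed every task.
    """
    if any(codes.get(task) is None for task in TASKS):
        return None
    return sum(codes[task] for task in TASKS)


# Hypothetical codes for four responses, for illustration only.
first_coder = [2, 1, 0, 2]
second_coder = [2, 1, 1, 2]
print(percent_agreement(first_coder, second_coder))

# A hypothetical student who completed every task, and one who left Q3 blank.
print(total_score({"representation": 2, "Q1": 2, "Q2": 1, "Q3": 3}))
print(total_score({"representation": 2, "Q1": 2, "Q2": 1}))
```

Returning `None` for an incomplete workbook, rather than treating a blank as zero, keeps unanswered questions from being confused with responses coded 0.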


Table 1 
Rubrics for analysing student representations and responses to their chosen survey question 

| Task / Question | Code | Description |
|---|---|---|
| Representation drawn from the data for chosen survey question | 2 | Conventional graph type/table appropriately summarising the data |
| | 1 | Informal representation adequately summarising the data |
| | 0 | Insufficient organisation or incomplete data |
| Q1. What do the data and your representation tell you about life in City A compared to life in City B? What conclusions have you reached? | 2 | Justified reason showing recognition of similarities or differences |
| | 1 | Statement that they are similar/different without giving a reason |
| | 0 | Response does not summarise the data to form a conclusion; Idiosyncratic response |
| Q2. Were all the data the same? Describe any variation you found. | 2 | Appropriate description of variation |
| | 1 | General statement without description of variation |
| | 0 | Idiosyncratic response |
| Q3. How certain are you that your data and conclusion are true for all Grade 4 students in City A and City B? | 3 | Recognition that the survey is a sample of the population and hence uncertainty (explicitly or implicitly stated) |
| | 2 | Recognition that the survey may not have been answered correctly/truthfully, leading to uncertainty |
| | 1 | General statement of uncertainty without reason, OR statement of certainty with valid reason (e.g., because they used/checked the data carefully) |
| | 0 | Certainty without a valid reason; Idiosyncratic response |
| Q4. From doing this activity, what are two things you have learned about writing survey questions? [NB: Each response was coded separately.] | 2 | The type of question/way it is worded is important; OR different types of survey question lead to different data/output; OR response focuses on data collection/analysis, OR consistency/variation. |
| | 1 | General statement about ease or difficulty of writing survey questions; OR response relates to specific survey item/s, rather than surveying in general; OR statement of need to check carefully. |
| | 0 | Survey writing is “fun”; Idiosyncratic response |



Results 


The overall activity was designed to develop students’ skills in the practice of statistics: 
(i) posing and refining statistical questions in developing a survey to learn more about their 
peers in each city (specifically, to compare their respective city lives), (ii) collecting data by 
answering questions on-line, (iii) analysing the data by making representations, identifying 
variation, and looking for trends in the data, and (iv) drawing conclusions and inferences 
while acknowledging uncertainty. Data about posing and refining questions from step (i), 
and the collecting data component of step (ii) are reported in English et al. (2017). This paper 
extends those results by reporting on the students’ reflections on what they had learnt from 
writing survey questions. The remainder of the results are related to the Analyse Data and 
Interpret Results steps of the practice of statistics. The data presented here are the outcomes 
for the students from one of the cities (City A). 

Data representations. Fifty-three students completed at least one representation of the 
data for the questions given to their group. Coding was based on the most complete 
representation presented (see examples in Figure 1). Overall, 43% of the representations 
were recorded tallies, whereas 32% were bar/value graphs, and 14% were some form of data 
summary list. Others were considered idiosyncratic. Sixty-four percent of the representations 
were assigned Code 2, 23% Code 1, and 13% Code 0. Although perhaps informal, it was 
encouraging to find that 87% of students realised the importance of displaying all of the data. 


[Figure 1 panels: three student representations. Code 2: “How do you travel to school?”; Code 1: “Do you get homework? Yes or No.”; Code 0: “Rank how much you like Australian birds and animals.”] 


Figure 1. Examples of representations created for questions chosen. 


Workbook responses. Examples of responses and percentages for the other questions at 
different code levels are given in Table 2. As seen in Table 2, not all students completed 
every question in the workbook. Thirteen students left between one and three questions blank 
but only two of these students scored zero on the workbook part of the activity; both, 
however, contributed to the subsequent class discussions. The distribution of total scores for 
the representation, Q1, Q2, and Q3, for the 39 students who answered every question is 
shown in Figure 2. Seventy-seven percent of students who completed the representation and 
the 3 questions received more than half marks, indicating that they provided complete 
responses for at least 2 tasks. The results for Q4, reflecting on what students had learned 
about writing survey questions, show this meta-thinking was difficult at this age. For the 52 
students who suggested at least one thing they had learned about writing survey questions, 
the modal score was 2, with a mean and median of 1.5, out of 4 possible points. 
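
The summary figures in this paragraph follow from the rubric maxima in Table 1: the representation, Q1, and Q2 are each coded out of 2 and Q3 out of 3, so the total is out of 9 and "more than half marks" means a total of 5 or more. The sketch below shows one way such summaries could be computed. The score lists are hypothetical illustrations, not the study's data, and the function names are assumptions introduced here.

```python
from statistics import mean, median, mode

# Maximum code for each task, taken from the rubrics in Table 1.
MAX_CODES = {"representation": 2, "Q1": 2, "Q2": 2, "Q3": 3}
MAX_TOTAL = sum(MAX_CODES.values())  # 2 + 2 + 2 + 3 = 9


def share_above_half(totals):
    """Percentage of students whose total exceeds half of the maximum score."""
    above = [t for t in totals if t > MAX_TOTAL / 2]
    return 100.0 * len(above) / len(totals)


def summarise(scores):
    """Mode, mean, and median of a list of combined scores."""
    return {"mode": mode(scores), "mean": mean(scores), "median": median(scores)}


# Hypothetical total scores (out of 9) and combined Q4 scores (out of 4).
hypothetical_totals = [9, 7, 5, 4, 8, 6, 3, 5]
hypothetical_q4 = [2, 2, 2, 1, 0, 3, 1, 2]

print(share_above_half(hypothetical_totals))
print(summarise(hypothetical_q4))
```

Deriving the threshold from `MAX_TOTAL` rather than hard-coding 5 keeps the calculation consistent if the rubric maxima were revised.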


Table 2 
Examples of student responses to questions about the analysis of the survey question chosen 

| Question | Code (%) | Examples |
|---|---|---|
| Q1. What do the data and your representation tell you about life in City A compared to life in City B? What conclusions have you reached? (n=46) | 2 (76%) | [City A] is more sporty in winter which is weird because winter there is hot winter here is cold. [ID143] |
| | | Most people in [City B] like flowers. Most people in [City A] like trees. [ID110] |
| | | The houses in [City A] and [City B] are quite different. Bricks are pretty much the only similarity. There is a lot more wood and concrete houses in [City B] than [City A]. [ID127] |
| | 1 (13%) | Most people get no pocket money. [ID125] |
| | | That [City A] and [City B] like different subjects [ID134] |
| | 0 (11%) | In [City A] it is cold and rains a lot [ID131] |
| | | For me it would be about an hour and 45 mins. but they do it for couple of hours. [ID111] |
| Q2. Were all the data the same? Describe any variation you found. (n=43) | 2 (63%) | No! One of them goes by train and none of us. Lots more of them drive to school than us. One more of us walk to school than them. [ID101] |
| | | No not all data was the same. They love the gardens and plants. But we don't love gardens and plants. There is one thing that is similar we both love sports. [ID123] |
| | | The most popular subject [in City A] is maths and in [City B] it is art. The second most popular in [City A] is art and in [City B] it is maths. [ID116] |
| | 1 (21%) | No! Because maybe other people like different things. [ID102] |
| | | No the data was not the same. [ID110] |
| | | No, because there were more differences than similarities. [ID158] |
| | 0 (16%) | Both school. [ID117] |
| | | We both like animals and we both like facts about them. [ID128] |
| Q3. How certain are you that your data and conclusion are true for all Grade 4 students in City A and City B? (n=40) | 3 (30%) | It might not be true because some people didn't do it. [ID104] |
| | | I wouldn't be certain because we don't know anything about the other schools. [ID152] |
| | 2 (18%) | Not very certain because people might have lied. [ID104] |
| | | 80% because some people could have not been correct. [ID124] |
| | 1 (30%) | Not very certain. [ID102] |
| | | I checked the chart. [ID111] |
| | 0 (22%) | I am sure that my data is true. [ID118] |
| | | Yes lots of people thought about the beach all the time. [ID146] |
| | | Koala will get 1st. [ID105] |
| Q4. From doing this activity, what are two things you have learned about writing survey questions? [NB: Each response is coded separately.] (n=52, 50) | 2 (15%) (10%) | You need to ask questions that will give you useful data from the people who you are asking the question to. [ID122] |
| | | You want a question that will get a variety of answers. [ID113] |
| | 1 (58%) (44%) | That they aren't as easy as you think and you have to check your tallies and make sure of them. [ID142] |
| | | It was easy to write about survey. [ID115] |
| | 0 (27%) (46%) | It can be really fun. [ID105] |
| | | The things I learnt about this lesson was all about working out and for me it was all about working the different things you could learn. [ID155] |
| | | It is difficult. [ID108] |

Figure 2. Total scores for the Analysis and Interpretation steps (Representation, Q1, Q2, and Q3). 


Discussion and Conclusion 


The objective of the activity designed to compare the lives of students in two cities, 
reported in English et al. (2017) and here, was to create an opportunity for the students to 
experience becoming statisticians, including the aim of posing their own questions rather 
than exploring those set by others. It provided the opportunity for the students to engage 
with their fellow students in the practice of statistics, as envisaged by GAISE (Franklin et 
al., 2007). Specifically, in terms of The Practice of Statistics: Posing Questions, the majority 
of responses considered variation and the actual data that would be produced. For the last 
question (Q4), thinking back about the process of writing questions, most students focused 
on the challenge or otherwise of posing their questions. The broad range of responses shows 
that many students did not reflect beyond their personal experience. The more sophisticated 
responses, however, related to the appropriateness of questions to generate the data expected. 
This illustrates the potential for Year 4 students to develop key understandings about posing 
questions that are foundational to designing statistical investigations (English et al., 2017). 

In terms of two steps of The Practice of Statistics: Analysing and Interpreting Data, the 
students were successful in creating representations and describing the variation they saw 
when comparing life in the two cities. The dominance of tallies used to represent the data 
suggests that the students were confident in using this graph type. The use of more 
sophisticated graphical representations by some students, however, suggests Year 4 students 
have the capacity to build a broader repertoire of graphical representations. In relation to the 
judgment of certainty, the responses were divided between commenting on not surveying 
the full population and questioning the reliability of the data. For many students, trusting the 
data and knowing that they could represent a larger group posed problems. As Watson (2006) 
suggests, recognising the relationship between a sample and a population is sophisticated. 

This activity contributed to building the foundation for being aware of data in the world 
and creating curiosity about the messages they contain. Hopefully this will be followed by 
students asking more critical questions when undertaking statistical investigations in the 
future. Creating a statistically literate society is a very long process, but it is 
through students experiencing the practice of statistics and sharing findings with their 
classmates that students will develop the ability to participate in society as statistically 
literate citizens (Gal, 2002; Watson, 2006). A few may actually become statisticians! 